177 research outputs found

    Bayesian computation for statistical models with intractable normalizing constants

    This paper deals with some computational aspects of the Bayesian analysis of statistical models with intractable normalizing constants. In the presence of intractable normalizing constants in the likelihood function, traditional MCMC methods cannot be applied. We propose an approach to sample from such posterior distributions. The method can be thought of as a Bayesian version of the MCMC-MLE approach of Geyer and Thompson (1992). To the best of our knowledge, this is the first general and asymptotically consistent Monte Carlo method for such problems. We illustrate the method with examples from image segmentation and social network modeling. We also study the asymptotic behavior of the algorithm and obtain a strong law of large numbers for empirical averages. Comment: 20 pages, 4 figures, submitted for publication.

    Fitting diversification models on undated or partially dated trees

    A recommendation – based on reviews by Amaury Lambert, Dominik Schrempf and one anonymous reviewer – of the article: Didier, G. (2020) Probabilities of tree topologies with temporal constraints and diversification shifts. bioRxiv, 376756, ver. 4 peer-reviewed and recommended by PCI Evolutionary Biology. doi: 10.1101/376756.

    PhyloBayes: Bayesian Phylogenetics Using Site-heterogeneous Models

    PhyloBayes is a software program for Bayesian phylogenetic reconstruction. Compared to other programs, its main distinguishing feature is the implementation of the CAT model, which accounts for fine-grained variation across sites in amino acid preferences using a Bayesian non-parametric approach. This chapter provides a detailed step-by-step practical introduction to phylogenetic analyses using PhyloBayes, using as an example a previously published dataset addressing the phylogenetic position of Microsporidia within eukaryotes. Through this historically emblematic case of a long-branch attraction artifact, a complete analysis under site-homogeneous and site-heterogeneous models is conducted and interpreted, thus providing an illustration of why modeling pattern variation is so fundamental for reconstructing deep phylogenies.

    Suppression of long-branch attraction artefacts in the animal phylogeny using a site-heterogeneous model

    BACKGROUND: Thanks to the large amount of signal contained in genome-wide sequence alignments, phylogenomic analyses are converging towards highly supported trees. However, high statistical support does not imply that the tree is accurate. Systematic errors, such as the Long Branch Attraction (LBA) artefact, can be misleading, in particular when the taxon sampling is poor, or the outgroup is distant. In an otherwise consistent probabilistic framework, systematic errors in genome-wide analyses can be traced back to model mis-specification problems, which suggests that better models of sequence evolution should be devised, that would be more robust to tree reconstruction artefacts, even under the most challenging conditions. METHODS: We focus on a well characterized LBA artefact analyzed in a previous phylogenomic study of the metazoan tree, in which two fast-evolving animal phyla, nematodes and platyhelminths, emerge either at the base of all other Bilateria, or within protostomes, depending on the outgroup. We use this artefactual result as a case study for comparing the robustness of two alternative models: a standard, site-homogeneous model, based on an empirical matrix of amino-acid replacement (WAG), and a site-heterogeneous mixture model (CAT). In parallel, we propose a posterior predictive test, allowing one to measure how well a model acknowledges sequence saturation. RESULTS: Adopting a Bayesian framework, we show that the LBA artefact observed under WAG disappears when the site-heterogeneous model CAT is used. Using cross-validation, we further demonstrate that CAT has a better statistical fit than WAG on this data set. Finally, using our statistical goodness-of-fit test, we show that CAT, but not WAG, correctly accounts for the overall level of saturation, and that this is due to a better estimation of site-specific amino-acid preferences. 
CONCLUSION: The CAT model appears to be more robust than WAG against LBA artefacts, essentially because it correctly anticipates the high probability of convergences and reversions implied by the small effective size of the amino-acid alphabet at each site of the alignment. More generally, our results provide strong evidence that site-specificities in the substitution process need be accounted for in order to obtain more reliable phylogenetic trees

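    Feministisch Flaneren in Film: Van de mannelijke voyeur naar een vrouwelijk flanerend subject in avant-gardefilm van de jaren zestig tot nu. [Feminist Flânerie in Film: From the male voyeur to a female strolling subject in avant-garde film from the 1960s to the present.]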

    Whoever looks holds the power; whoever is looked at is powerless. This dynamic belongs to classical cinema, which produces the male voyeuristic gaze and objectifies the woman, looking at her solely for the viewer's own visual pleasure. This is what I expose in the first part (chapters 1 and 2) of my research through classical film theory, drawing on Christian Metz's theory of the voyeur and Laura Mulvey's feminist response to classical film theory. The question arises: how can the woman take up an active subject position in film? In search of an alternative mode of observation for a female subject in film, in part two (chapters 2 and 3) I set the voyeur against the concept of the flâneur. The flâneur is a historical, subversive male figure closely tied to modernity and urban life, who enjoys the privilege of looking freely and moving through the metropolis. Within the feminist debate, female variants of the flâneur have also been described. Some theorists, however, are critical of the female flâneur and consider her existence possible only in the form of a prostitute or a consumer. By contrast, a few share my optimism about the possibility of a female flâneur in film as a mode for a female subject position. The next step in this research is to decouple the historical flâneur from its constitutive properties, as proposed by Ilija Tomanić Trivundža, in order to examine the feminist potential of a female flâneur. Following Anke Gleber, I argue that a female strolling subject in film is not limited to a literally strolling female protagonist, but also makes room for figurative flânerie, for instance as the spectator in the cinema, as the eye behind the camera, or as the film director.
    Through a number of comparative analyses I discern both literally and figuratively strolling women in avant-garde films, such as Rape (1969) by Yoko Ono and Cléo de 5 à 7 (1962) by Agnès Varda; I find a lesbian flâneur in Portrait d'une jeune fille de la fin des années 60 à Bruxelles (1994) by Chantal Akerman and a queer flâneur in Night Soil - Economy of Love (2015) by Melanie Bonajo; and I end my research (chapter 4) with a female flâneur of colour in the Strolling series by Cecile Emeke. With these cases I show that a female strolling subject in film holds inclusive feminist potential.

    Fast optimization of statistical potentials for structurally constrained phylogenetic models

    BACKGROUND: Statistical approaches for protein design are relevant in the field of molecular evolutionary studies. In recent years, new, so-called structurally constrained (SC) models of protein-coding sequence evolution have been proposed, which use statistical potentials to assess sequence-structure compatibility. In a previous work, we defined a statistical framework for optimizing knowledge-based potentials especially suited to SC models. Our method used the maximum likelihood principle and provided what we call the joint potentials. However, the method required numerical estimations by the use of computationally heavy Markov Chain Monte Carlo sampling algorithms. RESULTS: Here, we develop an alternative optimization procedure, based on a leave-one-out argument coupled to fast gradient descent algorithms. We assess that the leave-one-out potential yields very similar results to the joint approach developed previously, both in terms of the resulting potential parameters, and by Bayes factor evaluation in a phylogenetic context. On the other hand, the leave-one-out approach results in a considerable computational benefit (up to a 1,000 fold decrease in computational time for the optimization procedure). CONCLUSION: Due to its computational speed, the optimization method we propose offers an attractive alternative for the design and empirical evaluation of alternative forms of potentials, using large data sets and high-dimensional parameterizations.

    The Bayesian Approach to Molecular Phylogeny

    Bayesian inference is now routinely used in phylogenomics and, more generally, in macro-evolutionary studies. Beyond the philosophical debates it has raised concerning the choice of the prior and the meaning of posterior probabilities, Bayesian inference, combined with generic Monte Carlo algorithms, offers a flexible framework for introducing subjective or context information through the prior, and also for designing hierarchical models that formalize complex patterns of variation (across sites or branches) or integrate multiple levels of evolutionary processes. In this chapter, the principles of Bayesian inference, as applied to phylogenetic reconstruction, are first introduced, with an emphasis on the key features of the Bayesian paradigm that explain its flexibility in terms of model design and its robustness in inferring complex patterns and processes. A more specific focus is then put on the question of modeling pattern-heterogeneity across sites, using both parametric and non-parametric random-effect models. Finally, the current computational challenges are discussed.

    A maximum likelihood framework for protein design

    BACKGROUND: The aim of protein design is to predict amino-acid sequences compatible with a given target structure. Traditionally envisioned as a purely thermodynamic question, this problem can also be understood in a wider context, where additional constraints are captured by learning the sequence patterns displayed by natural proteins of known conformation. In this latter perspective, however, we still need a theoretical formalization of the question, leading to general and efficient learning methods, and allowing for the selection of fast and accurate objective functions quantifying sequence/structure compatibility. RESULTS: We propose a formulation of the protein design problem in terms of model-based statistical inference. Our framework uses the maximum likelihood principle to optimize the unknown parameters of a statistical potential, which we call an inverse potential to contrast with classical potentials used for structure prediction. We propose an implementation based on Markov chain Monte Carlo, in which the likelihood is maximized by gradient descent and is numerically estimated by thermodynamic integration. The fit of the models is evaluated by cross-validation. We apply this to a simple pairwise contact potential, supplemented with a solvent-accessibility term, and show that the resulting models have a better predictive power than currently available pairwise potentials. Furthermore, the model comparison method presented here allows one to measure the relative contribution of each component of the potential, and to choose the optimal number of accessibility classes, which turns out to be much higher than classically considered. CONCLUSION: Altogether, this reformulation makes it possible to test a wide diversity of models, using different forms of potentials, or accounting for other factors than just the constraint of thermodynamic stability. 
Ultimately, such model-based statistical analyses may help to understand the forces shaping protein sequences, and driving their evolution
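
    The abstract above mentions estimating the likelihood numerically by thermodynamic integration. As a rough illustration of that general idea only (not the authors' implementation), the sketch below uses the identity d/dβ log Z(β) = -E_β[E(x)] for p_β(x) ∝ exp(-β E(x)) and integrates it over β in [0, 1] with the trapezoid rule. The toy energies, the function names, and the exact computation of the expectation (where a real application would use MCMC samples) are all assumptions made for this example.

```python
import math

def log_partition(energies, beta):
    """Exact log Z(beta) = log sum_x exp(-beta * E(x)) for a small state space."""
    m = max(-beta * e for e in energies)  # shift for numerical stability
    return m + math.log(sum(math.exp(-beta * e - m) for e in energies))

def mean_energy(energies, beta):
    """Expected energy under p_beta(x) proportional to exp(-beta * E(x)).

    Computed exactly here; in practice this expectation would be
    approximated from MCMC samples at each value of beta.
    """
    log_z = log_partition(energies, beta)
    return sum(e * math.exp(-beta * e - log_z) for e in energies)

def thermodynamic_integration(energies, n_steps=1000):
    """Estimate log Z(1) - log Z(0) by the trapezoid rule over beta in [0, 1]."""
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        left = -mean_energy(energies, i * h)
        right = -mean_energy(energies, (i + 1) * h)
        total += 0.5 * h * (left + right)
    return total

# Hypothetical toy energies, one per state; not taken from the paper.
energies = [0.0, 0.5, 1.3, 2.0, 3.7]
estimate = thermodynamic_integration(energies)
exact = log_partition(energies, 1.0) - log_partition(energies, 0.0)
```

    On a state space this small the exact difference of log normalizing constants is available, so `estimate` can be checked against `exact`; the point of the technique is that only expectations along the path are needed when Z itself is intractable.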

    RevBayes: Bayesian Phylogenetic Inference Using Graphical Models and an Interactive Model-Specification Language.

    Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.]